JOHNSY K. JOHN

121 E Tasman Dr. Apt 258, San Jose, CA - 95134.

**Phone:** (201) 532 0032

**Email:** johnsyjohnk@gmail.com

**LinkedIn:** [www.linkedin.com/in/johnsyjohnk](http://www.linkedin.com/in/johnsyjohnk)

**SUMMARY**

Highly motivated and experienced ASIC hardware logic design professional seeking a challenging role in CPU/GPU micro-architecture, multi-core processor, router, and fabric architecture research and development. Passionate about exploring innovative architectural and micro-architectural solutions, with a focus on power-optimized performance enhancement techniques.

**KEY ACCOMPLISHMENTS**

* **Extensive Industry Experience:** 16 years of expertise in ASIC, processor/CPU/GPU/router/fabric micro-architecture and RTL development.
* **Security Clearance:** Hold an active US federal security clearance at the secret level.
* **Superconducting RQL Expertise:** 3.5 years of research and development experience in superconducting reciprocal quantum logic and micro-architecture.
* **Optoelectronics Experience:** 2 years of industry experience in optoelectronics product development.
* **Academic Research Background:** 3 years of academic research experience in processor architecture.
* **Intellectual Property Awards:** Recipient of three patents and four trade secret awards.
* **Published Research:** Authored three papers in international conferences related to computer architecture.
* **Technical Innovations:** Developed a trace analyzer tool for analyzing instruction fetch, prefetch, and branch prediction algorithms and mechanisms.

**TECHNICAL EXPERIENCE**

**Senior Design Engineer, AI Silicon Engineering Group, Microsoft Corporation, Mountain View, CA**

October 2024 – Present.

Design of the tensor compute logic.

**Staff Design Engineer at Intel Corporation, Santa Clara, CA.**

January 2022 – October 2024.

Design and development of PCIe exerciser chips for experimental internal projects.

**Micro-architect/Staff Design Engineer at Northrop Grumman, Linthicum Heights, MD**

August 2018 – January 2022: Staff Design Engineer.

December 2018 – July 2020: Senior Principal Engineer.

July 2018 – November 2018: Design Engineer T4.

* Worked on processor ASIC IP and SOC microarchitecture, specification, RTL development, integration, debug, bug fixes, lint, CDC and synthesis checks.
* Worked on GPU/accelerator trade-off studies, microarchitecture specification, and RTL development for cold CMOS/RQL technology, including the development of a vector execution unit and register files based on the AMD GCN3 ISA.
* Developed novel experimental digital microarchitectures in RTL from scratch using reciprocal quantum logic technology for cryogenic superconducting computing. This feasibility study included gate-count-optimized wave-pipeline implementations of 16-bit FP/integer SIMD, compute cores, ALU, LBIST, JTAG, serial data communication, router, fabric, and functional simulation models of libraries using Verilog UDPs.
* Chip lead and RTL designer for various experimental test chips, with responsibilities ranging from concept, microarchitecture, specification, and test plan development through test vector generation, test vector validation, regressions, and tape-out to lab test. Developed designs and tests to study margins, yield, reliability, bit error rate, clock generation, process variation, etc. through 2D shmoo plots. Completed 8 test chip designs. Worked on technology and flow bring-up and validation.
* Filed four patents / trade secrets.
* Developed Verilog UDP-based functional models of superconducting digital memory subcircuits.
* Developed a custom Git-based repository release flow to accelerate the design and verification process. Owned the repository and release flow for many designs.

**Micro-architect/Design Engineer at Intel Federal LLC, Hudson, MA**

January 2017 – July 2018.

* Data-intensive SIMD accelerator/processor core, ALU, torus router, and multi-dimensional fabric/interconnect micro-architecture: path-finding, research, and RTL development from scratch, static timing analysis, power analysis, and performance and power optimizations. Developed unit-level test benches to validate functionality and generate test vectors.
* SOC RTL logic design and integration: massively parallel core multi-dimensional 4D torus fabric micro-architecture. Developed a fabric system Verilog code generation flow based on the interconnect and routing algorithm.

**Developer – Data Compression Software Algorithms, Exabolt (self-funded passion project).**

October 2015 – July 2018.

* Development of lossless data compression software algorithms for video/data files and interactive infrastructure.

**Senior Design Engineer – Micro-architecture/RTL, AMD Boston Design Center, Boxborough, MA**

August 2007 – October 2015.

Micro-architecture development and RTL implementation of instruction fetch and instruction/data caches for high-performance microprocessors, working closely with design and verification teams.

Experience:

* Instruction fetch and cache logic development in RTL.
* Cache repair logic / MBIST RTL development.
* RAS – Error detection, reporting and logging RTL IP development.
* Instruction Based Sampling (IBS) and Performance Monitor Counter (PMC) modeling.
* Multi-level TLBs.
* Instruction prefetcher and prediction queues, branch prediction mechanisms.
* Developed a retire order trace analyzer to evaluate the fetch, prefetch and branch prediction algorithms and mechanisms.
* Timing and power optimizations, functional debug.
* Supported verification and physical design teams.
* Developed Verilog RTL models of various configurable cache array macros and flop and latch arrays based on the specifications. Optimized the macro array RTL to improve Verilog simulation performance.
* Formal verification (logical equivalence check) of the array macro RTL against the schematics.
* Worked with RTL, schematic, and verification teams on fixing logical/functional errors in macro schematics and RTL.

**Intern Design Engineer, Qualcomm CDMA Technologies, Processor Design Center, Austin, TX**

September 25th, 2006 – May 31st, 2007.

**Logic design, Interleaved Multi-Threaded VLIW processor.**

Experience:

* Added local and regional clock gating for power optimization in the TLB unit, with analysis done using Power Theater.
* RTL-level power analysis of the data cache unit.
* Converted a JTAG module from VHDL to Verilog and verified equivalence using Conformal LEC.
* Feasibility study of Power Theater (a tool to calculate dynamic power) at different RTL hierarchical levels. Used Power Theater extensively to extract register toggle rates and gated and ungated power. Analyzed dynamic power at the core and unit levels across various benchmarks.
* Comparative study of behavioral vs. structural coding in Verilog to assess the impact on code readability, gate count, power, critical path timing, area, etc. Analysis was done on the instruction cache unit by converting a major portion of the behavioral Verilog code to structural.

**Process Analyst, Optoelectronics Product Development Group, Optoelectronics Division of Lucent Technologies (later known as Agere Systems), Breinigsville, PA**

December 1999 – October 2001.

Performed test and analysis of Dense Wavelength-Division Multiplexing (DWDM) systems such as 40-channel arrayed waveguide grating multiplexers/demultiplexers, high-speed optical switches and routers (40 GB), and 40 Gbps lithium niobate electro-optic modulators, using test benches and in-house software. Have Class 100 wafer fabrication clean room experience.

**TECHNICAL SKILLS**

**Programming and software tools:**

Verilog, SystemVerilog, UVM, Verilog UDPs, VHDL, C, C++, Bash, C Shell, Perl, Python.

**Hardware/ Low Power Logic Design tools:**

Synopsys Verdi, Cadence Xcelium logic simulators.

Cadence Genus/Innovus

Synopsys DC, static timing analysis, power analysis

Cadence: JasperGold Lint/CDC, Encounter Conformal/Verplex LEC – Equivalency Check

Sequence Design: Power Theater – RTL level low power analysis

Altera Quartus

**Computer Architecture:**

**CPU/GPU microarchitecture, Instruction fetch, prefetch, branch prediction,**

**SIMD (Integer/FP), Register Banks, ALU, LBIST, MBIST, JTAG, Caches, Processor cores,**

**Thread interleaving, in-order/out-of-order superscalar processors,**

**path finding studies, test chip design, development and testing, etc.**

**ISAs: X86/AMD64/RISC-V/GCN3.**

**MIAOW open source GPU core:** Pathfinding and optimizations based on the open source model.

**Trace Analyzer:** Developed a CPU trace analyzer in C++ to study instruction fetch and data cache behavior and to experiment with various micro-architectural mechanisms such as branch predictors, instruction prefetchers, instruction caches, data caches, miss address buffers, victim caches, and hot set predictors.

**MARSSX86:** A tool for cycle accurate full system simulation of the x86-64 architecture.

**Simplesim-3.0:** Superscalar out-of-order processor performance modeling tool.

**Tools and Operating Systems:**

UNIX, Windows, Microsoft Office, Microsoft Visio.

Git-based repository and release flow on Linux.

Bitbucket, Jira.

**PATENTS AND TRADE SECRETS GRANTED/FILED/PENDING:**

1. Title: Detecting and Correcting Hard Errors in a Memory Array.

Date of Filing: 10/8/2013, Filing Number: 14/048830, Grant date: 11/17/2015

Inventor(s): Johnsy Kanjirapallil John, John Kalamatianos, R. Gelinas, P. Nevius, Vilas Sridharan.

2. Title: Dynamic remapping of Cache Lines.

Date of Filing: 04/15/2014, Filing Number: 14/253785, Grant date: 08/23/2016

Inventor(s): John Kalamatianos, Johnsy Kanjirapallil John, Robert Gelinas, Phillip Nevius.

3. Title: Increase Cache Associativity Using Hot Set Detection.

Date of Filing: 08/22/2016, Filing Number: 15/243921.

Inventor(s): John Kalamatianos, Adithya Yalavarti, Johnsy Kanjirapallil John.

4. Title: Synchronous logical clock generation using wave pipelined pulse generators and reset in RQL. Filed as trade secret.
5. Title: Self-correcting/initializing logical clock generation using wave pipelined pulse generators in RQL. Filed as trade secret.
6. Title: Targeted vector generation for simulation and test optimization of RQL memory structures. (Co-inventor). Filed as trade secret.
7. Title: Segmented Barrel Shifter. Filed as trade secret.

**RESEARCH PUBLICATIONS:**

1. J. Kong, Johnsy K. John, S. W. Chung, and J. S. Hu. **On the Thermal Attack in Instruction Caches.** In *IEEE Trans. on Dependable and Secure Computing (TDSC)*, Volume 7, No. 2, pp. 217 – 223, April 2010.
2. Johnsy K. John, J. S. Hu, and S. G. Ziavras. **Optimizing the Thermal Behavior of Subarrayed Data Caches**. In *Proc. of IEEE Int. Conf. on Computer Design (ICCD)*, pp. 625 - 630, San Jose, CA, October 2-5, 2005.
3. J. S. Hu, G. M. Link, Johnsy K. John, S. Wang, and S. G. Ziavras. **Resource-Driven Optimizations for Transient-Fault Detecting Superscalar Micro-architectures**. In *Proc. of 10th Asia-Pacific Computer Systems Architecture Conf. (ACSAC)*, pp. 200 - 214, Singapore, October 24-26, 2005.
4. Edwin Hou, Johnsy K. John, and Nitesh B. Guinde. **A Novel Algorithm and Micro-architecture for Finding Partial Fraction Expansion Coefficients**.

**ACADEMIC BACKGROUND**

**New Jersey Institute of Technology, Newark, NJ**

**PhD Candidate in Computer Engineering (coursework completed, thesis pending), CGPA: 3.66, 09/2004 – 05/2006.**

Research focus: Enhancing processor fetch and data throughput efficiency through branch predictor re-architecture.

**New Jersey Institute of Technology, Newark, NJ**

**Computer Engineering, Masters, CGPA: 3.65, 09/2001 – 01/2003**

Specialization: Digital Systems design, Arithmetic Systems architecture.

**Rajiv Gandhi Institute of Technology, Mahatma Gandhi University, Kottayam, India**

**Electronics and Communication Engineering, Bachelors, CGPA: 3.65, 08/1995 – 05/1999**

**PROFILE DETAILS**

US Citizen

Active Secret-level US Federal Security Clearance

Available to start within 3 months